ABSTRACT

A parameter identification problem is considered in the context of a linear abstract
Cauchy problem with a parameter-dependent evolution operator. Conditions are investigated
under which the gradient of the state with respect to a parameter possesses smoothness
properties which lead to local convergence of an estimation algorithm based on
quasilinearization. Numerical results are presented concerning estimation of unknown parameters
in delay-differential equations.
1.  Introduction


During  the  past  fifteen  years  considerable  effort  has  been  devoted  to 
the  problem  of  estimating  unknown  parameters  in  distributed  parameter 
systems.  The  recent  book  by  Banks  and  Kunisch  [9]  provides  an  excellent
account  of  the  progress  made  in  the  field.  Many  parameter  estimation 
problems  are  best  formulated  as  optimization  problems  (often  over  infinite 
dimensional  "parameter  spaces")  and  algorithms  are  developed  to  minimize  an 
appropriate  cost  function.  Although  there  are  several  approaches  to  these 
problems,  their  infinite  dimensional  nature  requires  that  numerical 
approximations  be  introduced  at  some  point  in  the  analysis.  Consequently, 
there  are  two  basic  classes  of  algorithms  for  optimization  based  parameter 
estimation.  The  first  type  of  algorithm,  and  the  most  frequently  used  for 
dynamic  problems,  is  indirect  and  proceeds  by  initially  approximating  the 
dynamic  equations  (e.g.  finite  elements,  finite  differences,  etc.)  and  then 
using  optimization  algorithms  on  the  finite  dimensional  problem.  This 
approach  is  typified  by  the  papers  [1]-[6],  [8],  [10],  and  [17].  The
second,  more  direct  approach  is  based  on  the  direct  application  of  a
(perhaps  infinite  dimensional)  optimization  algorithm,  employing
numerical  approximations  at  each  step  of  the  algorithm  to  compute  the
necessary  solutions  of  the  dynamic  equations.  This  approach  is  used  in
[12],  [13],  and  [18].  Both  methods  have  advantages  and  disadvantages.
Depending  on  the  particular  type  of  distributed  parameter  system,  one
method  may  outperform  the  other.

Direct  methods  such  as  quasilinearization  considered  here  are  often
limited  by  the  fact  that  the  dependence  on  unknown  parameters  of  the 
solution  to  the  infinite  dimensional  dynamical  equations  may  not  be  "smooth 
enough"  to  establish  convergence  of  the  algorithm.  Indeed,  some  algorithms 
may  not  be  properly  defined  without  this  necessary  smoothness.  Indirect 
methods  avoid  this  difficulty  and  often  lead  to  easily  implemented 
algorithms.  On  the  other  hand,  when  direct  methods  can  be  applied  it  is 
sometimes  possible  to  establish  the  convergence  and  the  rates  of
convergence  to  the  unknown  optimal  parameters  (see  [13],  [18]).
This paper considers the dependence on an unknown parameter $q$ of the
solution of the linear abstract Cauchy problem

\[
\begin{cases}
\dot{x}(t) = A(q)x(t) + u(t), & 0 < t < T, \\
x(0) = x_0.
\end{cases}
\tag{1.1}
\]

Our  ultimate  goal  is  to  formulate  and  establish  the  convergence  of  a 
gradient-based  parameter  estimation  algorithm  applicable  in  this  abstract 
setting.

This algorithm employs computation of the gradient $D_q x(t;q)$ of the
solution of (1.1) with respect to the parameter. Conditions for the
existence of this gradient are established in [11]. In Section 2 we review
these conditions and the general setting for the remainder of the paper.
Convergence of the algorithm requires certain smoothness properties of the
gradient $D_q x(t;q)$ with respect to $q$. These properties are established in
Section 3 and their applicability to a linear delay-differential equation
is discussed in Section 4. In this example the delay is among the
parameters so that in this setting the parameter dependence appears in
unbounded terms of the evolution operator $A(q)$.

An abstract parameter estimation algorithm is presented in Section 5,
where its convergence is established using the results of Section 3.
In Section 6 we present several numerical examples which indicate the
performance of the algorithm for delay and coefficient estimation in linear
delay-differential equations. Additional examples may be found in [12].
Numerical testing and evaluation on a wider variety of parameter estimation
problems will be undertaken in a subsequent paper.

2.  The  General  Setting 

The  application  of  quasilinearization  to  parameter  estimation  requires
knowledge  of  the  derivative  of  the  state  with  respect  to  the  unknown
parameter.  This  topic  is  addressed  in  [11].  In  this  section  we  review  the
framework  used  there  to  obtain  differentiability  and  establish  notation  to 
be  used  in  the  remainder  of  this  paper. 


Let $P$ be an open subset of a normed linear space $\mathcal{P}$ with norm $|\cdot|$ and
let $X$ be a Banach space with norm $\|\cdot\|$. For every $q \in P$ let $A(q)$ be a
linear operator on $D(A(q))$ in $X$. Throughout this paper we assume

(H1) $A(q)$ generates a strongly continuous semigroup $S(t;q)$ on $X$;

(H2) $D(A(q)) = D$ is independent of $q$;

(H3) $\|S(t;q)x\| \le M e^{\omega t}\|x\|$, $x \in X$, $t \ge 0$, $q \in P$, for some constants
$M$ and $\omega$ independent of $q$, $x$, and $t$.

Fix $T > 0$ and $u \in L^1(0,T;X)$. Define
\[ Q(t;q) = \int_0^t S(t-s;q)u(s)\,ds \]
for $q \in P$, $0 \le t \le T$. Note that if (1.1) has a strong solution then it is given by
the formula $x(t) = S(t;q)x_0 + Q(t;q)$ for $0 \le t \le T$.
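
The formula x(t) = S(t;q)x_0 + Q(t;q) can be checked numerically in the simplest possible case. The sketch below is only an illustration under assumptions that are not part of the paper: X is the real line, A(q) is multiplication by the scalar q (so S(t;q) = exp(qt)), the input u is constant, and the names `mild_solution` and `exact_constant_input` are hypothetical. The integral Q(t;q) is approximated by the trapezoidal rule and compared with the closed-form solution of x' = qx + c.

```python
import math

def mild_solution(q, x0, u, t, n=2000):
    """Evaluate x(t) = S(t;q) x0 + integral_0^t S(t-s;q) u(s) ds
    for the scalar semigroup S(t;q) = exp(q t).
    The integral is approximated by the composite trapezoidal rule."""
    h = t / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(q * (t - s)) * u(s)
    return math.exp(q * t) * x0 + h * total

def exact_constant_input(q, x0, c, t):
    """Closed-form solution of x' = q x + c, x(0) = x0, for q != 0."""
    return math.exp(q * t) * x0 + c * (math.exp(q * t) - 1.0) / q

# Illustrative values only (not taken from the paper).
approx = mild_solution(-2.0, 1.0, lambda s: 1.0, 1.5)
exact = exact_constant_input(-2.0, 1.0, 1.0, 1.5)
```

The two values agree up to the quadrature error, which is what the representation formula predicts in this scalar setting.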

In  applications  of  this  theory  it  is  useful  to  consider  just  those 
terms  of  A(q)  in  which  the  parameter  appears.  To  this  end  we  write 
A(q)  =  A  +  B(q)  where  A  and  B(q)  both  have  domain  D  and  A  is  independent 
of  q.  Concerning  B(q)  we  assume  the  following: 

(H4) For every $q, q_0 \in P$ there is a constant $K$ such that
\[ \int_0^T \|B(q)S(t;q_0)x\|\,dt \le K\|x\| \quad \text{for all } x \in D. \]

In Section 4 we discuss an example in which an unbounded operator $B(q)$
satisfies (H4). This hypothesis does imply, however, that the linear
mapping $x \mapsto B(q)S(\cdot;q_0)x$ is bounded as a mapping from $D$ into $L^1(0,T;X)$.
Let $F(q,q_0)$ denote the bounded linear extension of this operator to $X$. Let
$\|\cdot\|_1$ denote the norm in $L^1(0,T;X)$. Concerning $F$ we assume the following:
(H5) There is a closed subspace $Y$ of $X$ such that

(i) $F(q,q_0)x_0 \in L^1(0,T;Y)$ for $q, q_0 \in P$, and

(ii) for every $q_0 \in P$ and $\epsilon > 0$ there exists $\delta > 0$ such that
$\|F(q,q_0)y - F(q_0,q_0)y\|_1 \le \epsilon\|y\|$ for $y \in Y$ and
$|q - q_0| \le \delta$.

The analogue of $F$ for the function $Q(t;q)$ is the mapping $G(q,q_0)$ from
$L^1(0,T;D)$ into $L^1(0,T;X)$ defined by
\[ [G(q,q_0)w](t) = \int_0^t B(q)S(t-s;q_0)w(s)\,ds. \]
By (H4) it follows that $G$ can be extended to a bounded linear mapping on
$L^1(0,T;X)$ so that in particular $G(q,q_0)u$ is defined as an element of
$L^1(0,T;X)$. In addition we assume

(H6) $G(q,q_0)u \in L^1(0,T;Y)$ for $q, q_0 \in P$

where $Y$ denotes the subspace required by (H5).

3.  Parameter  Dependence 

In this section we deduce smoothness properties of the solution
$x(t;q) = S(t;q)x_0 + Q(t;q)$ with respect to $q$. These properties are derived
from similar properties of $F(q,q_0)$ and $G(q,q_0)$ which are operators related
to $A(q)$. These results will be used in Section 5 to prove convergence of
the parameter estimation algorithm. Throughout this section $T > 0$, $x_0 \in X$,
and $u \in L^1(0,T;X)$ are fixed as given in (1.1). The symbol $D_q$ denotes
Frechet differentiation with respect to $q$. These results are given as a
series of lemmas whose proofs are at the end of this section.

Lemma 3.1. Suppose (H1) - (H5) hold. In addition, suppose that for a
given $q^* \in P$

(H7) $F(q,q_0)x_0$ is Frechet differentiable with respect to $q$ at $q_0$
for every $q_0 \in P$.

For brevity, let $DF(q_0)$ denote $D_q[F(q,q_0)x_0]$ evaluated at $q = q_0$, for $q_0 \in P$.
In addition, suppose

(H8) $DF(q)$ is strongly continuous in $q$ at $q^*$, that is, for each
$h \in \mathcal{P}$ the mapping $q \mapsto DF(q)h$ from $P$ into $L^1(0,T;X)$ is
continuous at $q^*$.

Then for each $t \in [0,T]$, $S(t;q)x_0$ is Frechet differentiable with respect to $q$
at every $q \in P$ and $D_q[S(t;q)x_0]$ is strongly continuous with respect to $q$
at $q^*$.


Lemma 3.2. Suppose (H1) - (H6) hold and in addition suppose that for a
given $q^* \in P$,

(H9) $G(q,q_0)u$ is Frechet differentiable with respect to $q$ at $q_0$
for every $q_0 \in P$.

Again denoting this derivative by $DG(q_0)$ for $q_0 \in P$, assume

(H10) $DG(q)$ is strongly continuous in $q$ at $q^*$.

Then for $t \in [0,T]$, $Q(t;q)$ is Frechet differentiable with respect to $q$ at
every $q \in P$ and $D_q[Q(t;q)]$ is strongly continuous in $q$ at $q^*$.

Lemma 3.3. Suppose (H1) - (H5) and (H7) hold and in addition suppose

(H11) $F(q,q^*)$ is locally Lipschitz continuous in $q$ at $q^*$, uniformly
for $y \in Y$, that is, there exist constants $K_1, \delta_1 > 0$ such that
\[ \|F(q,q^*)y - F(q^*,q^*)y\|_1 \le K_1 |q - q^*|\,\|y\| \]
whenever $|q - q^*| < \delta_1$ and $y \in Y$.

Moreover, assume that

(H12) $DF(q)$ is strongly locally Lipschitz continuous with respect
to $q$ at $q^*$. That is, for each $h \in \mathcal{P}$, there are constants
$K, \delta > 0$ such that
\[ \|DF(q)h - DF(q^*)h\|_1 \le K|q - q^*| \]
for $|q - q^*| < \delta$.

Then $D_q[S(t;q)x_0]$ is strongly locally Lipschitz continuous with respect to
$q$ at $q^*$ for every $t \in [0,T]$.

Lemma 3.4. Suppose (H1) - (H6), (H9) - (H10) hold and in addition suppose

(H13) $DG(q)$ is strongly locally Lipschitz continuous with
respect to $q$ at $q^*$.

Then $D_q[Q(t;q)]$ is strongly locally Lipschitz continuous with respect to $q$
at $q^*$ for every $t \in [0,T]$.
Although the assumptions (H1) - (H13) are rather technical, we shall
see that they can be easily verified for delay systems even in the case
that the unknown parameter is the delay itself. Therefore, the results
presented here remove the limitations placed on the perturbation $B(q)$ in
papers [13] and [16].

For completeness we now present the proofs of Lemma 3.1 - Lemma 3.4.
However, these proofs make use of the basic results found in [11] and in
order to keep the length of the proofs reasonable we assume that the reader
has [11] in hand.

Proof of Lemma 3.1. It is shown in [11] that (H1) - (H5), (H7) imply that
$D_q[S(t;q)x_0]$ exists for $q \in P$. Furthermore, it is given by the formula

\[ D_q[S(t;q)x_0]h = \int_0^t S(t-s;q)[DF(q)h](s)\,ds, \quad h \in \mathcal{P}. \tag{3.1} \]

We therefore obtain by substitution

\[
\begin{aligned}
D_q[S(t;q)x_0]h - D_q[S(t;q^*)x_0]h
 &= \int_0^t [S(t-s;q) - S(t-s;q^*)]\,[DF(q)h](s)\,ds \\
 &\quad + \int_0^t S(t-s;q^*)\bigl([DF(q)h](s) - [DF(q^*)h](s)\bigr)\,ds.
\end{aligned}
\tag{3.2}
\]

Let $\epsilon > 0$ be given and let $C = Me^{\omega T}$. It can be shown (see the proof of
Theorem 1 [11]) that for all $x \in X$

\[ \|S(t;q)x - S(t;q^*)x\| \le C\|F(q,q^*)x - F(q^*,q^*)x\|_1. \tag{3.3} \]

Combining (3.3) with (H5ii) shows that for some $\delta_1 > 0$

\[ \|S(t;q)y - S(t;q^*)y\| \le \epsilon C\|y\|, \quad 0 \le t \le T, \ y \in Y, \]

whenever $|q - q^*| < \delta_1$. In particular, putting $y = [DF(q)h](s) \in Y$ by
(H5i) we obtain

\[ \|[S(t-s;q) - S(t-s;q^*)]\,[DF(q)h](s)\| \le \epsilon C\|[DF(q)h](s)\| \]

for $|q - q^*| < \delta_1$, a.e. $s \in (0,T)$. Since $DF(q)h$ is continuous at $q^*$, there
exist constants $K_2, \delta_2 > 0$ such that

\[ \|DF(q)h\|_1 \le K_2 \quad \text{for } |q - q^*| < \delta_2. \]

Combining these estimates shows that the first term in (3.2) is bounded
by $\epsilon C K_2$ if $|q - q^*| < \min(\delta_1, \delta_2)$.

Using (H8) it is easy to see that there exists $\delta_3 > 0$ such that the
second term in (3.2) is bounded by $\epsilon C$ for $|q - q^*| < \delta_3$. These estimates
complete the proof of Lemma 3.1.

Proof of Lemma 3.2. By Theorem 3 of [11], $D_q[Q(t;q)]$ exists for $q \in P$ and

\[
\begin{aligned}
D_q[Q(t;q)] - D_q[Q(t;q^*)]
 &= \int_0^t [S(t-s;q) - S(t-s;q^*)]\,[DG(q)](s)\,ds \\
 &\quad + \int_0^t S(t-s;q^*)\bigl([DG(q)](s) - [DG(q^*)](s)\bigr)\,ds
\end{aligned}
\tag{3.4}
\]

where $u$ has been suppressed in the notation. Since $DG(q) \in L^1(0,T;Y)$ for
$q \in P$ by (H6), the proof follows exactly as in the proof of Lemma 3.1.
Proof of Lemma 3.3. Let $\epsilon > 0$ be given. By (3.3) and (H11) there exists
$\delta_1 > 0$ such that

\[ \|S(t;q)y - S(t;q^*)y\| \le C K_1 \|y\|\,|q - q^*| \]

for $y \in Y$ and $|q - q^*| < \delta_1$. Since $DF(q)h \in L^1(0,T;Y)$ by (H5i) we have as
in the proof of Lemma 3.1 that the first term of (3.2) is bounded by
$C K_1 K_2 |q - q^*|$ for $|q - q^*| < \min(\delta_1, \delta_2)$. An estimate of the same form is
easily obtained for the second term of (3.2) using (H12). These estimates
complete the proof of Lemma 3.3.

Proof of Lemma 3.4. Since $DG(q)u \in L^1(0,T;Y)$ by (H6), the proof follows
exactly as in the proof of Lemma 3.3 using (3.4) in place of (3.2).

4.  Application  to  a  Delay-Differential  Equation

In  this  section  we  apply  the  framework  of  the  previous  sections  to  the 
linear  delay-differential  equation 

\[
\begin{cases}
\dot{x}(t) = a_0 x(t) + \sum_{k=1}^{n} a_k x(t - q_k) + u(t), \\
x(0) = \eta, \\
x_0 = \varphi.
\end{cases}
\tag{4.1}
\]

Let $\mathcal{P} = \mathbb{R}^n$, fix $r > 0$, and let
$P = \{q = (q_1, q_2, \ldots, q_n) : 0 < q_k < r \text{ for } k = 1, 2, \ldots, n\}$.
In equation (4.1), $\eta \in \mathbb{R}$, $a_k \in \mathbb{R}$, $k = 0, 1, \ldots, n$,
$\varphi \in L^1(-r,0)$ with norm denoted by $\|\varphi\|_1$, $u \in L^1(0,T)$, and $x_t(s) = x(t+s)$
for $t \ge 0$, $-r \le s \le 0$. By a solution of (4.1) we mean a function $x$ which
is absolutely continuous on $[0,T]$ and satisfies (4.1) almost everywhere on
$(0,T)$.


Following the construction in [14], we take $X = \mathbb{R} \times L^1(-r,0)$ with norm
$\|(\eta,\varphi)\| = |\eta| + \|\varphi\|_1$ and define for $q \in P$ an operator $A(q)$ on

\[ D = \{(\eta,\varphi) \in X : \varphi \text{ is abs. cont. on } [-r,0],\ \dot{\varphi} \in L^1(-r,0), \text{ and } \varphi(0) = \eta\} \]

by

\[ A(q)(\eta,\varphi) = \Bigl( a_0\varphi(0) + \sum_{k=1}^{n} a_k \varphi(-q_k),\ \dot{\varphi} \Bigr). \]

Then it is well known that $A(q)$ generates a strongly continuous semigroup
$S(t;q)$ on $X$ satisfying $S(t;q)(\eta,\varphi) = (y(t), y_t)$ where $y(t) = y(t;q)$ denotes the
solution of (4.1) with $u = 0$. It is a consequence of standard results that
(H1) - (H3) hold in this setting.

For $q = (q_1, \ldots, q_n)$ and $q_0$ in $P$, $x_0 = (\eta,\varphi) \in X$, and $w \in L^1(0,T)$ it
follows that in this example the mappings $F$ and $G$ of Section 2 are given by

\[ [F(q,q_0)x_0](t) = \Bigl( \sum_{k=1}^{n} a_k\, y(t-q_k;q_0),\ 0 \Bigr) \tag{4.2} \]

\[ [G(q,q_0)w](t) = \Bigl( \sum_{k=1}^{n} a_k\, z(t-q_k;q_0),\ 0 \Bigr) \tag{4.3} \]

for a.e. $t \in (0,T)$ where $z(t;q)$ denotes the solution of (4.1) with $u = w$
and $(\eta,\varphi) = (0,0)$. It is shown in [11] that these mappings satisfy
(H4) - (H6) with the closed subspace $Y = \mathbb{R} \times \{0\}$. It is also shown in [11]
that $F$ and $G$ satisfy the differentiability hypotheses (H7) and (H9) for
$(\eta,\varphi) = x_0 \in D$ and $q, q_0 \in P$. Furthermore, their Frechet derivatives are
given by

\[ [DF(q)h](t) = \Bigl( -\sum_{k=1}^{n} a_k\, \dot{y}(t-q_k;q)\,h_k,\ 0 \Bigr) \tag{4.4} \]

and

\[ [DG(q)h](t) = \Bigl( -\sum_{k=1}^{n} a_k\, \dot{z}(t-q_k;q)\,h_k,\ 0 \Bigr) \tag{4.5} \]

for $q \in P$, $h = (h_1, \ldots, h_n) \in \mathbb{R}^n$, where $y(t;q)$ is the solution of (4.1)
with $u = 0$ and $z(t;q)$ is the solution of (4.1) with $(\eta,\varphi) = (0,0)$.

It remains to establish conditions under which (H8), (H10) - (H13) are
satisfied.

Lemma 4.1. Fix $q^* = (q_1^*, \ldots, q_n^*) \in P$ and $x_0 \in D$. Then $F(q,q^*)x_0$ as
defined by (4.2) satisfies (H11).

Proof: In Section 5 of [11] it is shown that there is a constant $C_1$ such
that

\[ \|F(q,q^*)(\eta,0) - F(q^*,q^*)(\eta,0)\|_1 \le C_1 |h|\,\|(\eta,0)\| \]

for $q \in P$, $h = q - q^* \in \mathbb{R}^n$, $\eta \in \mathbb{R}$. Here we define $|h| = \sum_{k=1}^{n} |h_k|$. This estimate is
equivalent to (H11) with $Y = \mathbb{R} \times \{0\}$.

Lemma 4.2. Suppose $x_0 = (\eta,\varphi) \in D$. Then $DF(q)$ as given by (4.4) satisfies
(H8). Moreover, if in addition $\dot{\varphi}$ is of bounded variation on $[-r,0]$, then
$DF(q)$ satisfies (H12).


Proof: Let $A_m = \max_k |a_k|$ and $|h|_m = \max_k |h_k|$. Then we obtain the estimate

\[
\begin{aligned}
\|DF(q)h - DF(q^*)h\|_1
 &\le A_m |h|_m \sum_{k=1}^{n} \int_0^T |\dot{y}(t-q_k;q) - \dot{y}(t-q_k;q^*)|\,dt \\
 &\quad + A_m |h|_m \sum_{k=1}^{n} \int_0^T |\dot{y}(t-q_k;q^*) - \dot{y}(t-q_k^*;q^*)|\,dt.
\end{aligned}
\tag{4.6}
\]

Now from (4.1), with the convention $q_0 = q_0^* = 0$ for the undelayed term, we obtain

\[
\begin{aligned}
\int_0^T |\dot{y}(t-q_k;q) - \dot{y}(t-q_k;q^*)|\,dt
 &\le \int_0^T |\dot{y}(t;q) - \dot{y}(t;q^*)|\,dt \\
 &\le A_m \sum_{k=0}^{n} \int_0^T |y(t-q_k;q) - y(t-q_k^*;q^*)|\,dt \\
 &\le A_m \sum_{k=0}^{n} \int_0^T |y(t-q_k;q) - y(t-q_k;q^*)|\,dt \\
 &\quad + A_m \sum_{k=1}^{n} \int_0^T |y(t-q_k;q^*) - y(t-q_k^*;q^*)|\,dt \\
 &\le A_m \sum_{k=0}^{n} \int_0^T |y(t;q) - y(t;q^*)|\,dt \\
 &\quad + A_m \sum_{k=1}^{n} \int_0^T |y(t-q_k;q^*) - y(t-q_k^*;q^*)|\,dt.
\end{aligned}
\tag{4.7}
\]

Now since $y(t;q) = S(t;q)x_0$ is differentiable with respect to $q$ it is not
difficult to show that there are constants $\beta$ and $\delta$ such that

\[ \int_0^T |y(t;q) - y(t;q^*)|\,dt \le \beta|q - q^*| \tag{4.8} \]

whenever $|q - q^*| < \delta$. Combining (4.7) and (4.8) with (4.6) yields

\[
\begin{aligned}
\|DF(q)h - DF(q^*)h\|_1
 &\le A_m^2 |h|_m\, n(n+1)\,\beta\,|q - q^*| \\
 &\quad + A_m^2 |h|_m\, n \sum_{k=1}^{n} \int_0^T |y(t-q_k;q^*) - y(t-q_k^*;q^*)|\,dt \\
 &\quad + A_m |h|_m \sum_{k=1}^{n} \int_0^T |\dot{y}(t-q_k;q^*) - \dot{y}(t-q_k^*;q^*)|\,dt.
\end{aligned}
\tag{4.9}
\]

Since $(\eta,\varphi) \in D$, we have $y$ and $\dot{y}$ in $L^1(-r,T)$. Therefore, the integral
terms in (4.9) approach zero as $q \to q^*$ and (H8) holds. If $\dot{\varphi}$ is of bounded
variation on $[-r,0]$, then $y$ and $\dot{y}$ are of bounded variation on $[-r,T]$. By
[15, Theorem 2.1.7(b)] this implies that the integral terms in (4.9) are
$O(|q - q^*|)$ as $q \to q^*$ so that (H12) holds.

Lemma 4.3. Suppose $u \in L^1(0,T)$. Then $DG(q)$ as defined by (4.5) satisfies
(H10). Moreover, if in addition $u$ is of bounded variation on $[0,T]$, then
$DG(q)$ satisfies (H13).

Proof: Using (4.5) in place of (4.4) one obtains the estimate (4.9) above
with $y$ replaced by $z$. Now if $u \in L^1(0,T)$ then $z$ and $\dot{z}$ are in $L^1(-r,T)$ so
that (H10) holds. Similarly, if $u$ is of bounded variation on $[0,T]$, then $z$
and $\dot{z}$ are of bounded variation on $[-r,T]$ so that (H13) is satisfied.

5.  The  Algorithm

In this section we define a parameter estimation algorithm based on
quasilinearization and establish local convergence using the results of
Section 3. Here we assume that the parameter space $\mathcal{P}$ is $\mathbb{R}^n$ with canonical
basis $e_i$, $i = 1, 2, \ldots, n$.

Given $x_0 \in D$ and $q \in P \subset \mathbb{R}^n$ a strong solution of (1.1) is given by
$S(t;q)x_0 + Q(t;q)$. Here we have used the notation of Section 2. Let $C$ be
a bounded linear mapping from $X$ into $\mathbb{R}^l$ and define
\[ \gamma(t;q) = C(S(t;q)x_0 + Q(t;q)). \]

The parameter estimation algorithm is related to the following optimization
problem.


Problem 5.1. Let $y_j \in \mathbb{R}^l$, $j = 1, 2, \ldots, m$ be data values taken at
times $t_j \in (0,T]$, $j = 1, 2, \ldots, m$, respectively. For $q \in P$ define the
quadratic cost function

\[ J(q) = \sum_{j=1}^{m} |\gamma(t_j;q) - y_j|^2. \]

Find $q^* \in P$ such that $J(q^*) \le J(q)$ for all $q \in P$.

The quasilinearization method defines a recursive algorithm whose
fixed point is a local solution of Problem 5.1. A more complete
exposition is given in [7]. Given an initial guess $q_0 \in P$ define

\[ q_{k+1} = f(q_k), \quad k = 0, 1, 2, 3, \ldots \]

where

\[ f(q) = q - [D(q)]^{-1} b(q), \]

\[ D(q) = \sum_{j=1}^{m} M^T(t_j;q)\,M(t_j;q), \]

\[ b(q) = \sum_{j=1}^{m} M^T(t_j;q)\,[\gamma(t_j;q) - y_j], \]

and the matrix $M(t;q)$ has its $i$-th column $M^i(t;q)$ given by

\[ M^i(t;q) = C\,D_q[S(t;q)x_0 + Q(t;q)]\,e_i, \quad i = 1, 2, 3, \ldots, n. \]
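
The iteration q_{k+1} = q_k - D(q_k)^{-1} b(q_k) can be sketched in a few lines. The code below is a minimal illustration under assumptions that are not part of the paper: observations are scalar (l = 1), the state model is replaced by the toy function gamma(t;q) = exp(-q t) with a single parameter, whose sensitivity is known in closed form, and the helper names `solve_linear` and `quasilinearization_step` are hypothetical. In the paper's setting the rows M(t_j;q) would instead come from the sensitivity equations of the delay system.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ZeroDivisionError("D(q) is numerically singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

def quasilinearization_step(q, times, data, gamma, sens):
    """One step q -> f(q) = q - D(q)^{-1} b(q) for scalar observations.
    sens(t, q) returns the row M(t;q) as a list of length len(q)."""
    n = len(q)
    D = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for t, y in zip(times, data):
        m = sens(t, q)
        resid = gamma(t, q) - y
        for i in range(n):
            b[i] += m[i] * resid
            for k in range(n):
                D[i][k] += m[i] * m[k]
    delta = solve_linear(D, b)
    return [qi - di for qi, di in zip(q, delta)]

# Toy model (an assumption for illustration): gamma(t;q) = exp(-q t).
def gamma(t, q):
    return math.exp(-q[0] * t)

def sens(t, q):
    return [-t * math.exp(-q[0] * t)]

times = [0.5, 1.0, 1.5, 2.0]
data = [math.exp(-1.0 * t) for t in times]   # exact data for q* = 1
q = [0.5]                                    # initial guess
for _ in range(20):
    q = quasilinearization_step(q, times, data, gamma, sens)
```

Because the data are fitted exactly at q* = 1, this toy run corresponds to the exact-fit situation J(q*) = 0, in which fast local convergence is expected.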


Lemma 5.1. Suppose the hypotheses of Lemmas 3.1 and 3.2 are satisfied.
Then $M(t_j;q)$ is continuous in $q$ at $q^*$.

Proof. This is a direct consequence of Lemmas 3.1 and 3.2 and the above
definitions.

Lemma 5.2. Suppose the hypotheses of Lemmas 3.3 and 3.4 are satisfied.
Then there exist constants $\alpha, \delta > 0$ such that

\[ |M(t_j;q) - M(t_j;q^*)| \le \alpha|q - q^*| \]

for $|q - q^*| < \delta$, $j = 1, 2, \ldots, m$.

Proof. This is a direct consequence of Lemmas 3.3 and 3.4 and the above
definitions.

We can now prove the following convergence results. These results are
typical of quasilinearization methods and the proofs given here are in the
same spirit as those in [7]. We obtain superlinear convergence when there
is an exact fit to data (Theorem 5.1) and linear convergence in the
presence of error (Theorem 5.2).


Theorem 5.1. Suppose the hypotheses of Lemmas 3.1 and 3.2 are satisfied.
Moreover, assume $[D(q^*)]^{-1}$ exists, $f(q^*) = q^*$, and $J(q^*) = 0$. Then for
every $\epsilon > 0$ there exists $\delta > 0$ such that

\[ |f(q) - f(q^*)| \le \epsilon|q - q^*| \]

for $|q - q^*| < \delta$. In particular, there is a neighborhood $U$ of $q^*$ such that
$q_k \to q^*$ as $k \to \infty$ whenever $q_0 \in U$.

Proof. Note that $f(q^*) = q^*$ implies that $b(q^*) = 0$, or

\[ \sum_{j=1}^{m} M^T(t_j;q^*)\,[\gamma(t_j;q^*) - y_j] = 0. \tag{5.1} \]

Therefore

\[
\begin{aligned}
f(q) - f(q^*) &= D(q)^{-1}\bigl[D(q)(q - q^*) - b(q)\bigr] \\
 &= D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)\bigl[M(t_j;q)(q - q^*) - (\gamma(t_j;q) - y_j)\bigr] \\
 &= D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)\bigl[M(t_j;q) - M(t_j;q^*)\bigr](q - q^*) \\
 &\quad - D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)\bigl[\gamma(t_j;q) - \gamma(t_j;q^*) - M(t_j;q^*)(q - q^*)\bigr] \\
 &\quad - D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)\bigl[\gamma(t_j;q^*) - y_j\bigr].
\end{aligned}
\]

Therefore, using (5.1) we have that

\[
\begin{aligned}
f(q) - f(q^*) &= D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)\bigl[M(t_j;q) - M(t_j;q^*)\bigr](q - q^*) \\
 &\quad - D(q)^{-1} \sum_{j=1}^{m} M^T(t_j;q)\bigl[\gamma(t_j;q) - \gamma(t_j;q^*) - M(t_j;q^*)(q - q^*)\bigr] \\
 &\quad - D(q)^{-1} \sum_{j=1}^{m} \bigl[M^T(t_j;q) - M^T(t_j;q^*)\bigr]\bigl[\gamma(t_j;q^*) - y_j\bigr].
\end{aligned}
\tag{5.2}
\]

Note that $D(q)^{-1}$ exists and is bounded in a neighborhood of $q^*$ since
$D(q^*)^{-1}$ exists by assumption and $D(q)^{-1}$ is continuous at $q^*$ by Lemma 5.1.

Let $\epsilon > 0$ be given. Using Lemma 5.1 it is easy to see that there
exist constants $\beta_1, \delta_1 > 0$ such that the first term in (5.2) is bounded by
$\epsilon\beta_1|q - q^*|$ for $|q - q^*| < \delta_1$. Furthermore, since $M(t_j;q^*)$ is the Frechet
derivative of $\gamma(t_j;q)$ at $q^*$, one can show that there exist constants
$\beta_2, \delta_2 > 0$ such that the second term of (5.2) is bounded by $\epsilon\beta_2|q - q^*|$ for
$|q - q^*| < \delta_2$. Combining these estimates with (5.2) yields

\[
|f(q) - f(q^*)| \le \epsilon\beta|q - q^*| + |D(q)^{-1}| \sum_{j=1}^{m} |M^T(t_j;q) - M^T(t_j;q^*)|\,|\gamma(t_j;q^*) - y_j|
\tag{5.3}
\]

for $|q - q^*| < \delta = \min(\delta_1, \delta_2)$, where $\beta = \beta_1 + \beta_2$. Since $J(q^*) = 0$, the last
term in (5.3) is zero. This estimate yields the desired result.


The following theorem does not require an exact fit to data, but does
place some technical restrictions on the behaviour of $M$ near $q^*$. Note
that if Lemmas 3.3 and 3.4 hold then there exists $\bar{\delta} > 0$ such that for
$0 < \delta < \bar{\delta}$ there exists a constant $K(\delta)$ such that

\[ \sum_{j=1}^{m} |M^T(t_j;q) - M^T(t_j;q^*)| \le K(\delta)|q - q^*| \]

for $|q - q^*| < \delta$. Let $K^* = \limsup_{\delta \downarrow 0} K(\delta)$ and define

\[ \lambda^* = K^*\,|D(q^*)^{-1}|\,\max_j |\gamma(t_j;q^*) - y_j|. \tag{5.4} \]

Theorem 5.2. Suppose the hypotheses of Lemmas 3.3 and 3.4 are satisfied.
Moreover, assume $[D(q^*)]^{-1}$ exists and $f(q^*) = q^*$. Let $\lambda^*$ be defined by
(5.4) and assume $\lambda^* < 1$. Then there exists $\delta^* > 0$ such that

\[ |f(q) - f(q^*)| \le \lambda^*|q - q^*| \]

for $|q - q^*| < \delta^*$. In particular, $q_k \to q^*$ as $k \to \infty$ whenever
$|q_0 - q^*| < \delta^*$.

Proof. This estimate is a direct consequence of (5.3).


6.  Numerical  Examples 

In  this  section  we  consider  several  examples  in  which  the  above 
algorithm  was  used  to  solve  parameter  estimation  problems  in  delay- 
differential  equations.  In  these  examples  the  emphasis  is  on  delay 
identification  since  in  the  abstract  setting  this  represents  an  unbounded 
perturbation  of  the  generator  as  noted  in  Section  4. 
With  the  exception  of  Example  6.8,  the  various  unknown  parameters  are
estimated  using  data  generated  from  closed-form  expressions  for  the 
solution  found  by  the  "method  of  steps".  The  algorithm  is  implemented  by 
an  averaging  scheme  [2]  which  approximates  the  state  equation  and  the 
associated  sensitivity  equations  by  a  system  of  ordinary  differential 
equations.  This  system  is  solved  by  a  fourth-order  Runge-Kutta  routine. 
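
To make the computational setup concrete, the following sketch approximates a single-delay equation x'(t) = a x(t) + b x(t-r) by an averaging-type system of ordinary differential equations and integrates it with a classical fourth-order Runge-Kutta step. It is a sketch under stated assumptions, not the implementation of [2]: it assumes the commonly used form w_0' = a w_0 + b w_N, w_j' = (N/r)(w_{j-1} - w_j) for j = 1, ..., N, it initializes each segment with the history value at the segment midpoint (which equals the segment average only for a linear history), and the function names are hypothetical. The sensitivity equations are not included.

```python
def averaging_rhs(w, a, b, r):
    """Right-hand side of the averaging ODE system.
    w[0] approximates x(t); w[j] approximates the average of x over the
    j-th of N equal segments of [t - r, t]; w[N] stands in for x(t - r)."""
    n = len(w) - 1
    rate = n / r
    dw = [a * w[0] + b * w[n]]
    dw.extend(rate * (w[j - 1] - w[j]) for j in range(1, n + 1))
    return dw

def rk4_step(w, dt, a, b, r):
    """One classical fourth-order Runge-Kutta step for the averaging system."""
    def shifted(alpha, v):
        return [wi + alpha * vi for wi, vi in zip(w, v)]
    k1 = averaging_rhs(w, a, b, r)
    k2 = averaging_rhs(shifted(dt / 2, k1), a, b, r)
    k3 = averaging_rhs(shifted(dt / 2, k2), a, b, r)
    k4 = averaging_rhs(shifted(dt, k3), a, b, r)
    return [wi + dt / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
            for wi, c1, c2, c3, c4 in zip(w, k1, k2, k3, k4)]

def simulate(a, b, r, history, t_end, n_seg=16, dt=0.005):
    """Approximate x(t_end) for x' = a x + b x(t - r), x = history on [-r, 0]."""
    seg = r / n_seg
    # Midpoint values as a stand-in for segment averages of the history.
    w = [history(0.0)] + [history(-(j - 0.5) * seg) for j in range(1, n_seg + 1)]
    for _ in range(round(t_end / dt)):
        w = rk4_step(w, dt, a, b, r)
    return w[0]

# Equation of Example 6.1 with r = 1 and history x(t) = t + 1.
x_half_16 = simulate(-2.0, 3.0, 1.0, lambda s: s + 1.0, 0.5)
x_half_64 = simulate(-2.0, 3.0, 1.0, lambda s: s + 1.0, 0.5, n_seg=64)
```

On [0, 1] the exact solution of this equation is x(t) = 1.5t - 0.75 + 1.75 exp(-2t), so both values can be compared with x(0.5) = 1.75/e; the approximation error of such a scheme decreases only slowly as the number of segments grows, which is consistent with the small residual errors reported in the examples.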

In  the  one  delay  examples  the  averaging  scheme  is  implemented  with  the 
delay  interval  [-r,0]  divided  into  sixteen  equal  segments,  except  that 
Example  6.8  uses  64  equal  segments.  In  the  two  delay  examples  the 
intervals  [-r2,  -r1]  and  [-r1,  0]  are  divided  into  sixteen  equal  segments.
All  computations  were  done  on  a  VAX  11/750  minicomputer  or  a  SUN 
Microsystem  at  the  Institute  for  Computer  Applications  in  Science  and 
Engineering  (ICASE). 

Example  6.1.  This  example  illustrates  the  rapid  convergence  of  the  method 

for  a  single  unknown  parameter — the  delay  in  the  following  equation — with 
an  initial  guess  which  is  an  order  of  magnitude  greater  than  the  "true 
value"  of  r  =  1.0.  The  equation  and  the  results  of  the  iteration  are  given 
below.

\[
\begin{cases}
\dot{x}(t) = -2x(t) + 3x(t-r), & t > 0, \\
x(t) = t + 1, & t \le 0.
\end{cases}
\]

    iterate        r        error
       0        10.000     34.056
       1         1.299      0.955
       2         0.946      0.175
       3         0.989      0.115
       4         0.987      0.115

The convergence of the states to ten data points on the interval [0,2] is
illustrated in Figure 1.
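
For the equation of Example 6.1 with r = 1, the method of steps can be carried out by hand on [0, 2]: on [0, 1] the delayed term is the known history, and on [1, 2] it is the solution just obtained on [0, 1]. The sketch below encodes the resulting closed form and samples it at ten points. The particular sample times 0.2, 0.4, ..., 2.0 are an assumption made here for illustration; the paper states only that ten data points on [0, 2] are used.

```python
import math

def x_exact(t):
    """Solution of x'(t) = -2 x(t) + 3 x(t - 1), x(t) = t + 1 for t <= 0,
    obtained by the method of steps; valid for t <= 2."""
    if t <= 0.0:
        return t + 1.0
    if t <= 1.0:
        # Here x(t - 1) = t, so x' = -2 x + 3 t with x(0) = 1.
        return 1.5 * t - 0.75 + 1.75 * math.exp(-2.0 * t)
    if t <= 2.0:
        # Here x(t - 1) is the expression from the previous step.
        s = t - 1.0
        c = 3.0 + 1.75 * math.exp(-2.0)
        return 2.25 * t - 4.5 + (c + 5.25 * s) * math.exp(-2.0 * s)
    raise ValueError("closed form derived only up to t = 2")

# Assumed sample times (equally spaced); the paper does not list them.
sample_times = [0.2 * k for k in range(1, 11)]
data = [x_exact(t) for t in sample_times]
```

A quick consistency check is to compare a central-difference derivative of `x_exact` with the right-hand side -2 x(t) + 3 x(t - 1) at interior points of each step.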
Example 6.2. The data is the same as for Example 6.1; however, in this case
the algorithm is asked to estimate the coefficients as well as the delay.
The equation shows an insensitivity to the individual coefficients which
leads to the inaccuracy in the converged estimates. In fact, because of
errors introduced by the averaging scheme for computing the state, the
estimated values fit the data better than the "true values" used to compute
the data by the method of steps. The "true values" are a = -2, b = 3, and
r = 1. The equation and the results of the iteration are given below:

\[
\begin{cases}
\dot{x}(t) = a x(t) + b x(t-r), & t > 0, \\
x(t) = t + 1, & t \le 0.
\end{cases}
\]

    iterate       a         b         r        error
       0       -4.000     7.000     2.000     3.379
       1       -0.815     3.537     1.184     2.968
       2       -1.596     3.342     1.122     0.775
       3       -2.403     3.713     1.002     0.188
       4       -2.250     3.361     1.015     0.094
       5       -2.352     3.483     1.006     0.093

The convergence of the states is illustrated in Figure 2.

Example 6.3. This case illustrates the effect of a forcing function on the
state equation. The nonhomogeneous delay-differential equation

\[
\begin{cases}
\dot{x}(t) = a x(t) + b x(t-r) + u(t), & t > 0, \\
x(t) = t + 1, & t \le 0,
\end{cases}
\]

where

\[
u(t) =
\begin{cases}
0, & t < 0.1, \\
1, & t \ge 0.1,
\end{cases}
\]

is solved in closed form by the method of steps with parameter values
a = -2, b = 3, r = 1 as in Example 6.2. The results of the parameter
estimation algorithm are given below:

    iterate       a         b         r        error
       0       -4.000     7.000     2.000     4.0527
       1        1.022     3.165     1.140    39.2657
       2       -2.637    23.652     1.168    24.9577
       3       -5.979    28.631     1.141    11.6964
       4       -8.034    23.250     1.118     3.5425
       5       -5.167     5.417     1.028     2.0471
       6       -1.239     4.195     1.008     4.8981
       7       -2.861     6.222     1.005     1.8930
       8       -2.485     3.795     0.998     0.0819
       9       -2.115     3.201     1.013     0.0724
      10       -2.247     3.380     0.998     0.0691

The results are similar to those of Example 6.2, except that the solution
has become somewhat more sensitive to the coefficients.

Example  6.4.  This  example  indicates  the  ability  of  the  algorithm  to 

estimate  two  unknown  delays.  The  algorithm  converges  rapidly  from  a 
relatively poor initial guess. The "true values" are $r_1 = 1.0$ and
$r_2 = 2.0$. The equation and the results of the parameter estimation
algorithm are given below and the convergence of the states to ten data
points on the interval [0,3] is illustrated in Figure 3.

\[
\begin{cases}
\dot{x}(t) = -x(t) + x(t-r_1) - x(t-r_2), & t > 0, \\
x(t) = t + 1, & t \le 0.
\end{cases}
\]

    iterate      r_1       r_2      error
       0        0.600     4.000     7.500
       1        1.569     3.216     2.295
       2        1.146     2.100     0.100
       3        0.977     1.998     0.034
       4        0.978     2.003     0.032

Example  6.5.  The  equation  and  data  for  this  example  are  the  same  as  in 

Example 6.4. In this case the initial guess reverses the order of the
"true" delay values. The results of this iteration are given below and
convergence of the states on the interval [0,3] is illustrated in Figure 4.

    iterate      r_1       r_2      error
       0        2.000     1.000     2.460
       1        0.483     1.151     1.379
       2        1.561     2.014     0.788
       3        1.100     2.072     0.077
       4        0.980     2.002     0.033

Example 6.6. In this case the algorithm is asked to estimate parameters in
a delay model of a system with no delay. Ten data points on the interval
[0,2] are computed from the exponential solution of

\[
\begin{cases}
\dot{x}(t) = -2x(t), \\
x(0) = 1,
\end{cases}
\]

and the algorithm is asked to estimate unknown parameters in the system

\[
\begin{cases}
\dot{x}(t) = a x(t) + b x(t-r), & t > 0, \\
x(t) = t + 1, & t \le 0.
\end{cases}
\]

The first four iterations are given below:

    iterate       a         b         r        error
       0       -3.000     3.000     2.000     1.2577
       1       -3.060    -0.637     1.947     0.2551
       2       -1.687     0.235     1.981     0.1144
       3       -1.967     0.025     1.985     0.0110
       4       -2.000     0.000     1.986     0.0001

On  the  fifth  iteration  the  algorithm  aborted  when  it  was  asked  to  invert  a 
nearly  singular  matrix.  This  reflects  the  fact  that  at  the  true  parameter 
values  the  state  is  completely  insensitive  to  the  delay. 
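
The loss of invertibility can be reproduced in a small calculation. The sketch below is an illustration under stated assumptions rather than the computation performed in the paper: it uses the exact undelayed state x(t) = exp(at) at b = 0, closed-form or quadrature expressions for the sensitivities with respect to a and b on 0 <= t <= r, ten assumed sample times, and the fact that the forcing term of the delay sensitivity equation is proportional to b and therefore vanishes at b = 0. The resulting matrix D = sum of M^T M then has a zero row and column.

```python
import math

def sensitivity_row(t, a, r, n=400):
    """Row (dx/da, dx/db, dx/dr) at b = 0 for x' = a x + b x(t - r),
    x(s) = s + 1 on [-r, 0]. Valid only for 0 <= t <= r, where the
    delayed state x(t - r) is still the history t - r + 1."""
    dx_da = t * math.exp(a * t)
    # dx/db solves s' = a s + x(t - r), s(0) = 0 (trapezoidal quadrature).
    h = t / n
    acc = 0.0
    for i in range(n + 1):
        tau = i * h
        weight = 0.5 if i in (0, n) else 1.0
        acc += weight * math.exp(a * (t - tau)) * (tau - r + 1.0)
    dx_db = h * acc
    # The delay sensitivity is driven by -b * x'(t - r), which is zero at b = 0.
    dx_dr = 0.0
    return [dx_da, dx_db, dx_dr]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

times = [0.2 * k for k in range(1, 11)]      # assumed sample times on [0, 2]
D = [[0.0] * 3 for _ in range(3)]
for t in times:
    row = sensitivity_row(t, a=-2.0, r=2.0)
    for i in range(3):
        for k in range(3):
            D[i][k] += row[i] * row[k]

det_full = det3(D)                                   # zero: r is not identifiable
minor_ab = D[0][0] * D[1][1] - D[0][1] * D[1][0]     # positive: a and b are
```

In this idealized setting the coefficients a and b remain identifiable while the delay does not, which matches the behaviour of the iteration reported above.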

Example  6.7.  This  case  is  the  same  as  the  previous  example  except  that  the 

data  is  taken  from  the  closed  form  solution  of  the  nonhomogeneous  undelayed 
equation 

\[
\begin{cases}
\dot{x}(t) = -2x(t) + u(t), \\
x(0) = 1,
\end{cases}
\]

where u is the same step function as in Example 6.3. The results are
similar to those of the previous example.

    iterate       a         b         r        error
       0       -3.000     3.000     2.000     1.3135
       1       -2.848     0.099     1.804     0.5121
       2       -1.841     0.138     2.401     0.0811
       3       -1.971     0.003     2.508     0.0197

Example 6.8. In this example we consider the second-order equation

\[
\begin{cases}
\dfrac{d^2}{dt^2}x(t) + \omega^2 x(t) + a_0\,\dot{x}(t-r) + a_1\,x(t-r) = u(t), & t > 0, \\
x(t) = 1, & t \le 0,
\end{cases}
\]

where u(t) is the step function of Example 6.3. This equation models a
harmonic oscillator with retarded damping and restoring forces. In [13] a
quasilinearization algorithm is used to estimate coefficients in this
equation. The methods of this paper allow the delay r to be added to the
set of unknown parameters. For this example the averaging method was used
to compute "data" values for the parameter estimation algorithm with "true"
values of $\omega = 6$, $a_0 = 2.5$, $a_1 = 9$, and $r = 1$. The results of the iterative
algorithm are given below and the convergence of the states (displacement
and velocity) on the interval [0,2] is illustrated in Figures 5 and 6.